AITopics | separation power

Collaborating Authors

separation power

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On Universality Classes of Equivariant Networks

Neural Information Processing SystemsJun-17-2026, 11:52:04 GMT

Equivariant neural networks provide a principled framework for incorporating symmetry into learning architectures and have been extensively analyzed through the lens of their separation power, that is, the ability to distinguish inputs modulo symmetry. This notion plays a central role in settings such as graph learning, where it is often formalized via the Weisfeiler-Leman hierarchy. In contrast, the universality of equivariant models--their capacity to approximate target functions--remains comparatively underexplored. In this work, we investigate the approximation power of equivariant neural networks beyond separation constraints. We show that separation power does not fully capture expressivity: models with identical separation power may differ in their approximation ability. To demonstrate this, we characterize the universality classes of shallow invariant networks, providing a general framework for understanding which functions these architectures can approximate. Since equivariant models reduce to invariant ones under projection, this analysis yields sufficient conditions under which shallow equivariant networks fail to be universal. Conversely, we identify settings where shallow models do achieve separation-constrained universality. These positive results, however, depend critically on structural properties of the symmetry group, such as the existence of adequate normal subgroups, which may not hold in important cases like permutation symmetry.

artificial intelligence, machine learning, representation, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On Universality Classes of Equivariant Networks

Neural Information Processing SystemsJun-12-2026, 07:36:26 GMT

Equivariant neural networks provide a principled framework for incorporating symmetry into learning architectures and have been extensively analyzed through the lens of their, that is, the ability to distinguish inputs modulo symmetry. This notion plays a central role in settings such as graph learning, where it is often formalized via the Weisfeilern-Leman hierarchy. In contrast, the of equivariant models--their capacity to approximate target functions--remains comparatively underexplored. In this work, we investigate the approximation power of equivariant neural networks beyond separation constraints. We show that separation power does not fully capture expressivity: models with identical separation power may differ in their approximation ability. To demonstrate this, we characterize the universality classes of shallow invariant networks, providing a general framework for understanding which functions these architectures can approximate. Since equivariant models reduce to invariant ones under projection, this analysis yields sufficient conditions under which shallow equivariant networks fail to be universal. Conversely, we identify settings where shallow models do achieve separation-constrained universality. These positive results, however, depend critically on structural properties of the symmetry group, such as the existence of adequate normal subgroups, which may not hold in important cases like permutation symmetry.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

On Universality of Deep Equivariant Networks

Pacini, Marco, Petrache, Mircea, Lepri, Bruno, Trivedi, Shubhendu, Walters, Robin

arXiv.org Machine LearningOct-20-2025

Universality results for equivariant neural networks remain rare. Those that do exist typically hold only in restrictive settings: either they rely on regular or higher-order tensor representations, leading to impractically high-dimensional hidden spaces, or they target specialized architectures, often confined to the invariant setting. This work develops a more general account. For invariant networks, we establish a universality theorem under separation constraints, showing that the addition of a fully connected readout layer secures approximation within the class of separation-constrained continuous functions. For equivariant networks, where results are even scarcer, we demonstrate that standard separability notions are inadequate and introduce the sharper criterion of $\textit{entry-wise separability}$. We show that with sufficient depth or with the addition of appropriate readout layers, equivariant networks attain universality within the entry-wise separable regime. Together with prior results showing the failure of universality for shallow models, our findings identify depth and readout layers as a decisive mechanism for universality, additionally offering a unified perspective that subsumes and extends earlier specialized results.

artificial intelligence, machine learning, universality, (16 more...)

arXiv.org Machine Learning

2510.15814

Country:

South America > Chile (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data

Giroux, James, Fanelli, Cristiano

arXiv.org Artificial IntelligenceJul-21-2025

We present a (proto) Foundation Model for Nuclear Physics, capable of operating on low-level detector inputs from Imaging Cherenkov Detectors at the future Electron Ion Collider. Building upon established next-token prediction approaches, we aim to address potential challenges such as resolution loss from existing tokenization schemes and limited support for conditional generation. We propose four key innovations: (i) separate vocabularies for discrete and continuous variates, combined via Causal Multi-Head Cross-Attention (CMHCA), (ii) continuous kinematic conditioning through prepended context embeddings, (iii) scalable and simple, high-resolution continuous variate tokenization without joint vocabulary inflation, and (iv) class conditional generation through a Mixture of Experts. Our model enables fast, high-fidelity generation of pixel and time sequences for Cherenkov photons, validated through closure tests in the High Performance DIRC. We also show our model generalizes to reconstruction tasks such as pion/kaon identification, and noise filtering, in which we show its ability to leverage fine-tuning under specific objectives.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2505.08736

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

On Universality Classes of Equivariant Networks

Pacini, Marco, Santin, Gabriele, Lepri, Bruno, Trivedi, Shubhendu

arXiv.org Artificial IntelligenceJun-4-2025

Equivariant neural networks provide a principled framework for incorporating symmetry into learning architectures and have been extensively analyzed through the lens of their separation power, that is, the ability to distinguish inputs modulo symmetry. This notion plays a central role in settings such as graph learning, where it is often formalized via the Weisfeiler-Leman hierarchy. In contrast, the universality of equivariant models-their capacity to approximate target functions-remains comparatively underexplored. In this work, we investigate the approximation power of equivariant neural networks beyond separation constraints. We show that separation power does not fully capture expressivity: models with identical separation power may differ in their approximation ability. To demonstrate this, we characterize the universality classes of shallow invariant networks, providing a general framework for understanding which functions these architectures can approximate. Since equivariant models reduce to invariant ones under projection, this analysis yields sufficient conditions under which shallow equivariant networks fail to be universal. Conversely, we identify settings where shallow models do achieve separation-constrained universality. These positive results, however, depend critically on structural properties of the symmetry group, such as the existence of adequate normal subgroups, which may not hold in important cases like permutation symmetry.

artificial intelligence, international conference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2506.02293

Country:

North America > United States (0.67)
Europe (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On the Expressive Power of Subgraph Graph Neural Networks for Graphs with Bounded Cycles

Chen, Ziang, Zhang, Qiao, Wang, Runzhong

arXiv.org Artificial IntelligenceFeb-5-2025

Graph neural networks (GNNs) have been widely used in graph-related contexts. It is known that the separation power of GNNs is equivalent to that of the Weisfeiler-Lehman (WL) test; hence, GNNs are imperfect at identifying all non-isomorphic graphs, which severely limits their expressive power. This work investigates $k$-hop subgraph GNNs that aggregate information from neighbors with distances up to $k$ and incorporate the subgraph structure. We prove that under appropriate assumptions, the $k$-hop subgraph GNNs can approximate any permutation-invariant/equivariant continuous function over graphs without cycles of length greater than $2k+1$ within any error tolerance. We also provide an extension to $k$-hop GNNs without incorporating the subgraph structure. Our numerical experiments on established benchmarks and novel architectures validate our theory on the relationship between the information aggregation distance and the cycle size.

artificial intelligence, machine learning, vertex, (16 more...)

arXiv.org Artificial Intelligence

2502.03703

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Los Angeles County > Chatsworth (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.61)

Add feedback

Separation Power of Equivariant Neural Networks

Pacini, Marco, Dong, Xiaowen, Lepri, Bruno, Santin, Gabriele

arXiv.org Artificial IntelligenceJun-13-2024

The separation power of a machine learning model refers to its capacity to distinguish distinct inputs, and it is often employed as a proxy for its expressivity. In this paper, we propose a theoretical framework to investigate the separation power of equivariant neural networks with point-wise activations. Using the proposed framework, we can derive an explicit description of inputs indistinguishable by a family of neural networks with given architecture, demonstrating that it remains unaffected by the choice of non-polynomial activation function employed. We are able to understand the role played by activation functions in separability. Indeed, we show that all non-polynomial activations, such as ReLU and sigmoid, are equivalent in terms of expressivity, and that they reach maximum discrimination capacity. We demonstrate how assessing the separation power of an equivariant neural network can be simplified to evaluating the separation power of minimal representations. We conclude by illustrating how these minimal components form a hierarchy in separation power.

architecture, neural network, separation power, (13 more...)

arXiv.org Artificial Intelligence

2406.08966

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Rethinking the Capacity of Graph Neural Networks for Branching Strategy

Chen, Ziang, Liu, Jialin, Chen, Xiaohan, Wang, Xinshang, Yin, Wotao

arXiv.org Artificial IntelligenceFeb-10-2024

Graph neural networks (GNNs) have been widely used to predict properties and heuristics of mixed-integer linear programs (MILPs) and hence accelerate MILP solvers. This paper investigates the capacity of GNNs to represent strong branching (SB) scores that provide an efficient strategy in the branch-and-bound algorithm. Although message-passing GNN (MP-GNN), as the simplest GNN structure, is frequently employed in the existing literature to learn SB scores, we prove a fundamental limitation in its expressive power -- there exist two MILP instances with different SB scores that cannot be distinguished by any MP-GNN, regardless of the number of parameters. In addition, we establish a universal approximation theorem for another GNN structure called the second-order folklore GNN (2-FGNN). We show that for any data distribution over MILPs, there always exists a 2-FGNN that can approximate the SB score with arbitrarily high accuracy and arbitrarily high probability. A small-scale numerical experiment is conducted to directly validate our theoretical findings.

neural network, sb score, theorem 4, (15 more...)

arXiv.org Artificial Intelligence

2402.07099

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Washington > King County > Bellevue (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On Representing Mixed-Integer Linear Programs by Graph Neural Networks

Chen, Ziang, Liu, Jialin, Wang, Xinshang, Lu, Jianfeng, Yin, Wotao

arXiv.org Artificial IntelligenceMay-25-2023

While Mixed-integer linear programming (MILP) is NP-hard in general, practical MILP has received roughly 100--fold speedup in the past twenty years. Still, many classes of MILPs quickly become unsolvable as their sizes increase, motivating researchers to seek new acceleration techniques for MILPs. With deep learning, they have obtained strong empirical results, and many results were obtained by applying graph neural networks (GNNs) to making decisions in various stages of MILP solution processes. This work discovers a fundamental limitation: there exist feasible and infeasible MILPs that all GNNs will, however, treat equally, indicating GNN's lacking power to express general MILPs. Then, we show that, by restricting the MILPs to unfoldable ones or by adding random features, there exist GNNs that can reliably predict MILP feasibility, optimal objective values, and optimal solutions up to prescribed precision. We conducted small-scale numerical experiments to validate our theoretical findings.

artificial intelligence, machine learning, milp, (16 more...)

arXiv.org Artificial Intelligence

2210.10759

Country:

North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)

Genre:

Research Report (0.50)
Workflow (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

On Representing Linear Programs by Graph Neural Networks

Chen, Ziang, Liu, Jialin, Wang, Xinshang, Lu, Jianfeng, Yin, Wotao

arXiv.org Artificial IntelligenceMay-25-2023

Learning to optimize is a rapidly growing area that aims to solve optimization problems or improve existing optimization algorithms using machine learning (ML). In particular, the graph neural network (GNN) is considered a suitable ML model for optimization problems whose variables and constraints are permutation--invariant, for example, the linear program (LP). While the literature has reported encouraging numerical results, this paper establishes the theoretical foundation of applying GNNs to solving LPs. Given any size limit of LPs, we construct a GNN that maps different LPs to different outputs. We show that properly built GNNs can reliably predict feasibility, boundedness, and an optimal solution for each LP in a broad class. Our proofs are based upon the recently--discovered connections between the Weisfeiler--Lehman isomorphism test and the GNN. To validate our results, we train a simple GNN and present its accuracy in mapping LPs to their feasibilities and solutions.

artificial intelligence, gnn, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2209.12288

Country:

North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre:

Research Report (0.70)
Overview (0.46)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback